[Figure: multiprocessor architecture; labels: Memory, Compute Processor, Interconnection]

Author

  • David Kotz
Abstract

Many scientific applications that run on today's multiprocessors, such as weather forecasting and seismic analysis, are bottlenecked by their file-I/O needs. Even if the multiprocessor is configured with sufficient I/O hardware, the file-system software often fails to provide the available bandwidth to the application. Although libraries and enhanced file-system interfaces can make a significant improvement, we believe that fundamental changes are needed in the file-server software. We propose a new technique, disk-directed I/O, to allow the disk servers to determine the flow of data for maximum performance. Our simulations show that tremendous performance gains are possible. Indeed, disk-directed I/O provided consistent high performance that was largely independent of data distribution, obtained up to [...] of peak disk bandwidth, and was as much as [...] times faster than traditional parallel file systems.

Similar resources

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

The IP lookup process is a key bottleneck in routing due to the increase in routing table size, increasing traffic, and migration to IPv6 addresses. IP address lookup involves computation of the Longest Prefix Match (LPM), for which existing solutions, such as BSD Radix Tries, scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...

Memory Bound vs. Compute Bound: A Quantitative Study of Cache and Memory Bandwidth in High Performance Applications

High performance applications depend on high utilizations of bandwidth and computing resources. They are most often limited by either memory or compute speed. Memory bound applications push the limits of the system bandwidth, while compute bound applications push the compute capabilities of the processor. Hierarchical caches are standard components of modern processors, designed to increase mem...

Beyond Processor-centric Operating Systems

By the end of the decade, computing designs will shift from a processor-centric architecture to a memory-centric architecture. At rack scale, we can expect a large pool of non-volatile memory (NVM) that will be accessed by heterogeneous and decentralized compute resources [3, 17]. Such memory-centric architectures will present challenges that today’s processor-centric OSes may not be able to add...

Cost-Performance Evaluation of SMP Clusters

Clusters of Personal Computers have been proposed as potential replacements for expensive compute servers. One limitation in the overall performance is the interconnection network. A possible solution is to use multiple processors on each node of the PC cluster. Parallel programs can then use the fast shared memory to exchange data within a node, and access the interconnection network to commun...


Publication date: 1994